On Document Classification with Self-Organising Maps

Authors

  • Jyri Saarikoski
  • Kalervo Järvelin
  • Jorma Laurikkala
  • Martti Juhola
Abstract

This research deals with the use of self-organising maps for the classification of text documents. The aim was to classify documents into separate classes according to their topics. We therefore constructed self-organising maps that were effective for this task and tested them with German newspaper documents. We compared the results with those of k-nearest-neighbour searching and k-means clustering. For five and ten classes, the self-organising maps performed better, yielding average classification accuracies as high as 88-89%, whereas nearest-neighbour searching gave 74-83% and k-means clustering 72-79% as their highest accuracies.
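As a rough illustration of how a self-organising map can be turned into a classifier, the sketch below trains a small map with NumPy, labels each map node by a majority vote of the training documents it attracts, and assigns a new document the label of its nearest labelled node. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not describe the map size, the learning schedule, the document vectorisation, or the node-labelling rule, so the function names (`train_som`, `label_nodes`, `classify`) and all parameter values here are hypothetical.

```python
import numpy as np


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols SOM on data of shape (n_docs, n_features)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighbourhood radius
    # Grid position of every node, used to measure distance on the map.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen training vectors.
    weights = data[rng.integers(0, n, rows * cols)].astype(float).copy()
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            # Best-matching unit: the node whose weights are closest to the sample.
            bmu = np.argmin(((weights - data[i]) ** 2).sum(axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights


def label_nodes(weights, data, labels):
    """Give each node the majority label of the training docs mapped to it (-1 if none)."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmus = d2.argmin(axis=1)
    node_labels = np.full(len(weights), -1)
    for node in range(len(weights)):
        hits = labels[bmus == node]
        if hits.size:
            node_labels[node] = np.bincount(hits).argmax()
    return node_labels


def classify(weights, node_labels, docs):
    """Assign each document the label of its nearest labelled node."""
    labelled = np.flatnonzero(node_labels >= 0)
    w = weights[labelled]
    d2 = ((docs[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
    return node_labels[labelled[d2.argmin(axis=1)]]
```

In practice the input rows would be document vectors (for example term-weight vectors) and `labels` would hold non-negative integer topic ids; nodes that attract no training document are left unlabelled and are skipped at classification time.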

Related resources

Self-Organising Maps in Document Classification: A Comparison with Six Machine Learning Methods

This paper focuses on the use of self-organising maps, also known as Kohonen maps, for the classification task of text documents. The aim is to effectively and automatically classify documents to separate classes based on their topics. The classification with self-organising map was tested with three data sets and the results were then compared to those of six well known baseline methods: k-mea...

Text Classification and Labelling of Document Clusters with Self-Organising Maps

The freely available law on the Internet could be one of the best application areas of text classification and labelling. This paper explores the high potential of the self-organising map for information reconnaissance by classifying and describing unknown legal text collections. The maps can be seen as topic-oriented libraries that are automatically created without intellectual input. The clus...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

A study of the use of self-organising maps in information retrieval

Purpose: We studied the applicability of self-organising maps for searching for information in a document collection. Design/methodology/approach: After conventional preprocessing, such as transformation into a vector space, documents from a German document collection were used to train a neural network of the Kohonen self-organising map type. Such an unsupervised network forms a document map from which rel...

Automatic Classification using Self-Organising Neural Networks in Astrophysical Experiments

Self-Organising Maps (SOMs) are effective tools in classification problems, and in recent years the even more powerful Dynamic Growing Neural Networks, a variant of SOMs, have been developed. Automatic Classification (also called clustering) is an important and difficult problem in many Astrophysical experiments, for instance, Gamma Ray Burst classification, or gamma-hadron separation. After a ...


Publication date: 2009